Eagle job-aware scheduling: divide and ... reorder

Authors

  • Pamela Delgado
  • Diego Didona
  • Florin Dinu
  • Willy Zwaenepoel
Abstract

We present Eagle, a new hybrid cluster scheduler for data-parallel programs, consisting of a centralized scheduler for long jobs and a set of distributed schedulers for short jobs. Eagle incorporates two new techniques: succinct state sharing and sticky batch probing. With succinct state sharing, the centralized scheduler informs the distributed schedulers of the placement of long jobs in a low-overhead way. The distributed schedulers then avoid worker nodes with long jobs to minimize head-of-line blocking. Combined with a small, dedicated partition for short jobs, succinct state sharing entirely eliminates head-of-line blocking of short jobs by long jobs. With sticky batch probing, the distributed schedulers queue probes for their tasks at various worker nodes, but when a worker node finishes a task, rather than executing the next task in its queue, it requests a new task from a distributed scheduler according to the desired scheduling discipline. We use sticky batch probing to implement a distributed approximation of SRPT (Shortest Remaining Processing Time) with starvation prevention. We have implemented Eagle as a Spark plugin, and we have measured job completion times for a subset of the Google trace on a 100-node cluster for a variety of cluster loads. We show that Eagle improves at all percentiles over Hawk, an earlier hybrid scheduler with which it shares a code base. We provide simulation results for larger clusters, different traces, and for comparison with other scheduling policies. Using traces from Cloudera, Google and Yahoo, we show that Eagle outperforms other scheduling disciplines at most percentiles, and is more robust against mis-estimation of task duration.
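
The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not Eagle's implementation: the function names `choose_probe_targets` and `next_task`, the `bypassed` counter, and the `max_bypass` threshold are hypothetical, and the starvation rule shown is only one plausible way to bound how long a probe can be skipped. The first function models succinct state sharing as a set of node ids that the centralized scheduler has flagged as running long jobs. The second models a worker that, on finishing a task, picks the queued probe whose job has the least estimated remaining work (an SRPT approximation) and keeps that probe queued while the job still has unstarted tasks.

```python
import random
from dataclasses import dataclass


@dataclass
class ShortJob:
    job_id: int
    pending: list  # estimated durations of tasks not yet started

    def remaining_work(self):
        return sum(self.pending)


@dataclass
class Probe:
    job: ShortJob
    bypassed: int = 0  # times a probe queued behind this one was served first


def choose_probe_targets(num_nodes, long_job_nodes, k, rng):
    """Pick up to k probe targets, skipping nodes flagged as running long jobs.

    long_job_nodes stands in for the succinct state shared by the
    centralized scheduler (assumed here to be a set of node ids).
    """
    free = [n for n in range(num_nodes) if n not in long_job_nodes]
    return rng.sample(free, min(k, len(free)))


def next_task(queue, max_bypass=3):
    """Called by a worker that just became idle; returns (job_id, duration)."""
    # Discard probes whose job has no unstarted tasks left.
    queue[:] = [p for p in queue if p.job.pending]
    if not queue:
        return None
    starved = [p for p in queue if p.bypassed >= max_bypass]
    if starved:
        chosen = starved[0]  # serve the oldest starved probe first
    else:
        chosen = min(queue, key=lambda p: p.job.remaining_work())
    # Every probe queued ahead of the chosen one has been bypassed once more.
    for p in queue:
        if p is chosen:
            break
        p.bypassed += 1
    duration = chosen.job.pending.pop()
    # The probe is not removed here: it stays queued ("sticky") and is only
    # discarded on a later call once its job has no pending tasks.
    return chosen.job.job_id, duration


if __name__ == "__main__":
    rng = random.Random(0)
    print(choose_probe_targets(10, {0, 1, 2, 3}, 2, rng))
    queue = [Probe(ShortJob(1, [5, 5, 5])), Probe(ShortJob(2, [1, 1]))]
    while (task := next_task(queue)) is not None:
        print(task)
```

With the default threshold, the worker in the usage example serves both tasks of job 2 before any task of job 1, because job 2 has less remaining work. Lowering `max_bypass` makes the older probe win sooner, trading some SRPT benefit for a bound on waiting.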

Similar resources

Eagle: A Better Hybrid Data Center Scheduler

Eagle is a new hybrid data center scheduler that considerably improves the job completion times for short jobs. Eagle builds on the Hawk hybrid scheduler, using a centralized scheduler for long jobs and distributed schedulers for short jobs. The main innovation in Eagle is that it provides an approximate and potentially slightly out-of-date summary of the centralized scheduler state to the distr...

Efficient Scheduling Strategy Using Communication Aware Scheduling for Parallel Jobs in Clusters

In the area of Computer Science, Parallel job scheduling is an important field of research. Finding a best suitable processor on the high performance or cluster computing for user submitted jobs plays an important role in measuring system performance. A new scheduling technique called communication aware scheduling is devised and is capable of handling serial jobs, parallel jobs, mixed jobs and...

Guides to Inventory Policy: Functions and Lot Sizes

But this is only one of the characteristic problems business managers face in dealing with production planning, scheduling, keeping inventories in hand, and expediting. Other questions, just as perplexing and baffling when managers approach them on the basis of intuition and pencil work alone, are: How often should we reorder, or how should we adjust production, when sales are uncertain? What cap...

Security Aware Parallel and Independent Job Scheduling in Grid Computing Environments Based on Adaptive Job Replication

In a grid environment, jobs may be scheduled to multiple machines across different administrative domains. However, grid security is a main hurdle to making the job scheduling decision secure, reliable and fault tolerant. A security-aware parallel and independent job scheduling algorithm in grid computing environment based on adaptive job replications was proposed. In risky and failure-prone grids,...

Energy Efficiency of Thermal-Aware Job Scheduling Algorithms under Various Cooling Models

One proposed technique to reduce energy consumption of data centers is thermal-aware job scheduling, i.e. job scheduling that relies on predictive thermal models to select among possible job schedules to minimize its energy needs. This paper investigates, using a more realistic linear cooling model, the energy savings of previously proposed thermal-aware job scheduling algorithms, which assume ...

Publication date: 2016